
    Cost-Sensitive Classification: Empirical Evaluation of a Hybrid Genetic Decision Tree Induction Algorithm

    This paper introduces ICET, a new algorithm for cost-sensitive classification. ICET uses a genetic algorithm to evolve a population of biases for a decision tree induction algorithm. The fitness function of the genetic algorithm is the average cost of classification when using the decision tree, including both the costs of tests (features, measurements) and the costs of classification errors. ICET is compared here with three other algorithms for cost-sensitive classification - EG2, CS-ID3, and IDX - and also with C4.5, which classifies without regard to cost. The five algorithms are evaluated empirically on five real-world medical datasets. Three sets of experiments are performed. The first set examines the baseline performance of the five algorithms on the five datasets and establishes that ICET performs significantly better than its competitors. The second set tests the robustness of ICET under a variety of conditions and shows that ICET maintains its advantage. The third set looks at ICET's search in bias space and discovers a way to improve the search. Comment: See http://www.jair.org/ for any accompanying files.

    How to Shift Bias: Lessons from the Baldwin Effect


    What makes it so hard to look and to listen? Exploring the use of the Cognitive and Affective Supervisory Approach with children’s social work managers

    This paper reports on the findings of an ESRC-funded Knowledge Exchange project designed to explore the contribution of an innovative approach to supervision to social work practitioners’ assessment and decision-making practices. The Cognitive and Affective Supervisory Approach (CASA) is informed by cognitive interviewing techniques originally designed to elicit best evidence from witnesses and victims of crime. Adapted here for use in childcare social work supervision contexts, this model is designed to enhance the quantity and quality of information available for decision-making. Facilitating the reporting of both ‘event information’ and ‘emotion information’, it allows a more detailed picture to emerge of events, as recalled by the individual involved, and the meaning they give to them. Practice supervisors from Children’s Services in two local authorities undertook to introduce the CASA into supervision sessions and were supported in this through the provision of regular reflective group discussions. The project findings highlight the challenges for practitioners of ‘detailed looking’ and for supervisors of ‘active listening’. The paper concludes by acknowledging that the CASA’s successful contribution to decision-making is contingent on both the motivation and confidence of supervisors to develop their skills and an organisational commitment to, and resourcing of, reflective supervisory practices and spaces.

    Inducing safer oblique trees without costs

    Decision tree induction has been widely studied and applied. In safety applications, such as determining whether a chemical process is safe or whether a person has a medical condition, the cost of misclassification in one of the classes is significantly higher than in the other class. Several authors have tackled this problem by developing cost-sensitive decision tree learning algorithms or have suggested ways of changing the distribution of training examples to bias the decision tree learning process so as to take account of costs. A prerequisite for applying such algorithms is the availability of costs of misclassification. Although this may be possible for some applications, obtaining reasonable estimates of costs of misclassification is not easy in the area of safety. This paper presents a new algorithm for applications where the cost of misclassifications cannot be quantified, although the cost of misclassification in one class is known to be significantly higher than in another class. The algorithm utilizes linear discriminant analysis to identify oblique relationships between continuous attributes and then carries out an appropriate modification to ensure that the resulting tree errs on the side of safety. The algorithm is evaluated with respect to one of the best-known cost-sensitive algorithms (ICET), a well-known oblique decision tree algorithm (OC1) and an algorithm that utilizes robust linear programming.

    Case-Based Translation: First Steps from a Knowledge-Light Approach Based on Analogy to a Knowledge-Intensive One

    This paper deals with case-based machine translation. It builds on previous work that uses a proportional analogy on strings, i.e., a quaternary relation expressing that "String A is to string B as string C is to string D". The first contribution of this paper is the rewording of this work in terms of case-based reasoning: a case is a problem-solution pair (A, A') where A is a sentence in an origin language and A' is its translation in the destination language. First, three cases (A, A'), (B, B'), (C, C') such that "A is to B as C is to the target problem D" are retrieved. Then, the analogical equation in the destination language "A' is to B' as C' is to x" is solved and D' = x is a suggested translation of D. Although it does not involve any linguistic knowledge, this approach was effective and gave competitive results at the time it was proposed. The second contribution of this work examines how this prior knowledge-light case-based machine translation approach could be improved by using additional pieces of knowledge associated with cases, domain knowledge, retrieval knowledge, and adaptation knowledge, and other principles or techniques from case-based reasoning and natural language processing.

    Incremental dimension reduction of tensors with random index

    We present an incremental, scalable and efficient dimension reduction technique for tensors that is based on sparse random linear coding. Data is stored in a compactified representation with fixed size, which makes memory requirements low and predictable. Component encoding and decoding are performed on-line without computationally expensive re-analysis of the data set. The range of tensor indices can be extended dynamically without modifying the component representation. This idea originates from a mathematical model of semantic memory and a method known as random indexing in natural language processing. We generalize the random-indexing algorithm to tensors and present signal-to-noise-ratio simulations for representations of vectors and matrices. We also present a mathematical analysis of the approximate orthogonality of high-dimensional ternary vectors, which is a property that underpins this and other similar random-coding approaches to dimension reduction. To further demonstrate the properties of random indexing we present results of a synonym identification task. The method presented here has some similarities with random projection and Tucker decomposition, but it performs well at high dimensionality only (n>10^3). Random indexing is useful for a range of complex practical problems, e.g., in natural language processing, data mining, pattern recognition, event detection, graph searching and search engines. Prototype software is provided. It supports encoding and decoding of tensors of order >= 1 in a unified framework, i.e., vectors, matrices and higher order tensors. Comment: 36 pages, 9 figures.

    Conquering Language: Using NLP on a Massive Scale to Build High Dimensional Language Models from the Web

    Dictionaries only contain some of the information we need to know about a language. The growth of the Web, the maturation of linguistic processing tools, and the decline in price of memory storage allow us to envision descriptions of languages that are much larger than before. We can conceive of building a complete language model for a language using all the text that is found on the Web for this language. This article describes our current project to do just that.

    Global Peak in Atmospheric Radiocarbon Provides a Potential Definition for the Onset of the Anthropocene Epoch in 1965

    Anthropogenic activity is now recognised as having profoundly and permanently altered the Earth system, suggesting we have entered a human-dominated geological epoch, the ‘Anthropocene’. To formally define the onset of the Anthropocene, a synchronous global signature within geological-forming materials is required. Here we report a series of precisely dated tree-ring records from Campbell Island (Southern Ocean) that capture peak atmospheric radiocarbon (14C) resulting from Northern Hemisphere-dominated thermonuclear bomb tests during the 1950s and 1960s. The only alien tree on the island, a Sitka spruce (Picea sitchensis), allows us to seasonally resolve Southern Hemisphere atmospheric 14C, demonstrating the ‘bomb peak’ in this remote and pristine location occurred in the last quarter of 1965 (October-December), coincident with the broader changes associated with the post-World War II ‘Great Acceleration’ in industrial capacity and consumption. Our findings provide a precisely resolved potential Global Stratotype Section and Point (GSSP) or ‘golden spike’, marking the onset of the Anthropocene Epoch.

    Parallel processes: Getting it write?

    This paper offers a critical reflection on the processes surrounding the writing of a book aimed at foster carers and residential workers. By utilising the concept of parallel process as well as the four modes of reflection identified by Ruch (2000), we explore the ways in which the wider context of both direct works with children and reflective practice have been impacted by the tensions between relationally based, child-centred practice and wider managerialist imperatives. The paper draws parallels between these practice tensions and those currently in play within the academy. By employing a dialogical and reflective analysis of the process and interactions surrounding the writing of a practitioner-targeted book, the paper demonstrates the ways in which critical and process reflection post-event took place, considering the heretofore unexplored parallel processes between writing for practice, and practice. In so doing, it identifies the ways in which the authors mirrored practitioners in relation to the management of anxiety, a sense of constrained autonomy and confidence, and an avoidance of recognising and challenging structural and political context. Implications for the creation of practice literature and for the academy are considered. Output Status: Forthcoming/Available Online.

    Ecological risk assessment of nano-enabled pesticides: a perspective on problem formulation

    Plant protection products containing nanomaterials that alter the functionality or risk profile of active ingredients (nano-enabled pesticides) promise many benefits over conventional pesticide products. These benefits may include improved formulation characteristics, easier application, better targeting of pest species, increased efficacy, lower application rates, and enhanced environmental safety. After many years of research and development, nano-enabled pesticides are starting to make their way into the market. The introduction of this technology raises a number of issues for regulators, including how the ecological risk assessment of nano-enabled pesticide products differs from that of conventional plant protection products. In this paper, a group drawn from regulatory agencies, academia, research, and the agrochemicals industry offers a perspective on relevant considerations pertaining to the problem formulation phase of the ecological risk assessment of nano-enabled pesticides. Glen W. Walker, Rai S. Kookana, Natalie E. Smith, Melanie Kah, Casey L. Doolette, Philip T. Reeves, Wess Lovell, Darren J. Anderson, Terence W. Turney, and Divina A. Navarro.